
Analyzing security event logs

3 Tasks

1 hr

Visible to: All users
Advanced Pega Platform '23 English

Scenario

MDC wants to ensure the security of its Delivery Service application by enforcing geolocation for all users. To detect potential security breaches, MDC will log each user's pxRequestor, pxLatitude and pxLongitude, along with a timestamp for each event. The timestamp will be used to calculate the velocity between two events, which will be expressed in either miles-per-hour (mph) or kilometers-per-hour (kph). If the velocity between two events exceeds a reasonable value, MDC will output the two events to identify the user and location of the potential security breach.

To accomplish this, MDC will implement a method to efficiently compute the velocity between two events recorded in a PegaRULES-SecurityEvent.log file for the same user. It is unnecessary to deny access to the Delivery Service application if geolocation is not enabled.

Note: Completion of this challenge requires the use of the Linux Lite VM.

The following table provides the credentials you need to log in to the Delivery Service application. However, this challenge is mainly meant for evaluating the design options, and there are no specific implementation tasks. 

Role     User name                 Password
Admin    admin@deliveryservice     rules

You must initiate your own Pega instance to complete this Challenge.

Initialization may take up to 5 minutes so please be patient.

Detailed Tasks

1 Review solution detail

The solution to this exercise requires the installation of Apache Spark and Rumble. In the following steps, you must: 

  • Read the installation instructions for Rumble on the RumbleDB documentation website.
  • Download Apache Spark.
  • Download the latest Rumble JAR file from GitHub.

2 Install and configure Apache Spark and Rumble

  1. Download Apache Spark using a browser that runs inside the VM. Extract the archive and move the extracted directory to /usr/local/bin by running the following commands:
    tar zxvf spark-2.4.4-bin-hadoop2.7.tgz
    sudo mv spark-2.4.4-bin-hadoop2.7 /usr/local/bin
  2. Add Apache Spark to your PATH environment variable by running the following commands:
    export SPARK_HOME=/usr/local/bin/spark-2.4.4-bin-hadoop2.7
    export PATH=$SPARK_HOME/bin:$PATH
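  3. Verify the installation by running spark-submit first from its own directory and then from your home directory. The second call succeeds only if PATH is set correctly:
    cd $SPARK_HOME/bin
    spark-submit --version
    cd $HOME
    spark-submit --version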
  4. Use a browser running inside the VM to download Rumble.
  5. Create a directory from which you can run Rumble. For example, $HOME/pega8/rumble.
  6. Move the spark-rumble-x.x.jar file that you downloaded into the directory that you created.

Define the Rule-Utility-Functions in a Rule-Utility-Library named Security

Security • LogCustomEventValueGroup() Void

  • Parameters: eventType String, outcome String, message String, customFlds ClipboardProperty
  • Description: customFlds must be a Text ValueGroup. Call this Function when you want to log a custom Security Event

Security • NumericTimestamp() String

  • Parameters: unit String, allowed values = "h", "m", "s"
  • Description: Returns the current timestamp in units of hours, minutes, or seconds as a String
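
The actual NumericTimestamp function is a Pega Rule-Utility-Function. As a minimal sketch of the arithmetic only, the following Python illustration assumes the timestamp is measured from the Unix epoch (1-1-1970 GMT); the name numeric_timestamp and the now parameter are hypothetical and exist only to make the example repeatable.

```python
import time

# Seconds per unit for the three allowed values of the `unit` parameter.
_SECONDS_PER_UNIT = {"h": 3600.0, "m": 60.0, "s": 1.0}


def numeric_timestamp(unit, now=None):
    """Return the time since the Unix epoch in the given unit, as a string.

    `now` (seconds since the epoch) is a hypothetical parameter for testing;
    when omitted, the current time is used.
    """
    if unit not in _SECONDS_PER_UNIT:
        raise ValueError('unit must be "h", "m", or "s"')
    seconds = time.time() if now is None else now
    # repr() of a float keeps the decimal point for values of this size.
    return repr(seconds / _SECONDS_PER_UNIT[unit])


print(numeric_timestamp("h", now=7200.0))  # two hours after the epoch
print(numeric_timestamp("h"))              # current time in hours
```

Returning a value that always contains a decimal point matters later in this exercise, because the pattern that removes the quotes around numeric values expects one.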

Override the MDC-DS-Work OpenDefaults Extension Point

OpenDefaults

Enable Geo-Location and Custom Security Events logging

The Chrome browser allows Geolocation requests only from a secure origin: a page served over HTTPS, or a page addressed as localhost. Within a Linux VM, you can therefore satisfy Chrome by mapping the VM's IP address to the name localhost in /etc/hosts. For example:

<code>pega8@Lubuntu1:~$ more /etc/hosts</code>
<code>192.168.118.145 localhost</code>
<code>127.0.0.1 localhost</code>
<code>127.0.1.1 Lubuntu1</code>
Tip: For best results, use Firefox as your browser. 
  1. Enter the ifconfig IP address as the first entry for localhost.
  2. Test by entering ping localhost.

    The pyGeolocationTrackingIsEnabled When rule must return true.

    The When rule applied to Data-Portal displays the following window after you log in:
Location access

 

  1. Click Allow Location Access.
  2. Check that pxLatitude and pxLongitude are set within the pxRequestor page on your Clipboard.
  3. Click Security > Tools > Security > Security Event Configuration to launch the Security Event Configuration landing page.
  4. At the bottom of the landing page, click ON to enable custom events.
  5. Click Submit.
    Custom event
  6. Open an Event case for review, and check pyWorkPage on the Clipboard or Trace OpenDefaults to see the values for the customFlds Text Value Group, as shown in the following figure:
    Custom fields
  7. Click System > Operations > Logs.
  8. To the right of SECURITYEVENT, click Log Files.
  9. If challenged, enter the user name admin and the password admin.
  10. Open the SecurityEvent.log file to see whether the customFlds values have been recorded.

Remove Quotes Around Numeric Values (necessary for JSON format)

Use the following steps to remove quotes around numeric values:

  1. Copy and paste the text that you want the regular expression to examine.
  2. Replace each numeric value with (-?[0-9]*\.[0-9]*) if the value can be negative, or ([0-9]*\.[0-9]*) if the value cannot be negative.
  3. Use \1, \2, and so on, to specify which parsed value goes where.
Note: The find clause and the replace clause do not have to be separated by a slash. Whatever character follows the initial "s" becomes the statement's delimiter. The same delimiter must then separate the find clause from the replace clause and must end the statement.
BEFORE text.txt
<samp>{"id":"25538d8d-2861-40a9-b6df-d289e4b73a7e","eventCategory":"Custom event","eventType":"FooBla","appName":"Delivery Service","tenantID":"shared","ipAddress":"127.0.0.1","timeStamp":"Mon 2019 Aug 05, 19:31:42:060","operatorID":"Admin@deliveryservice","nodeID":"ff9ef7835fd4906aea82694c981938d0","outcome":"Fail","message":"FooBla failed","requestorIdentity":"20190805T192510","lon":"-98.0315889","lat":"30.123275"}</samp>
<samp>cat text.txt | sed -r 's/"lon":"(-?[0-9]*\.[0-9]*)","lat":"(-?[0-9]*\.[0-9]*)"}/"lon":\1,"lat":\2}/'</samp>
AFTER text.txt
<samp>{"id":"25538d8d-2861-40a9-b6df-d289e4b73a7e","eventCategory":"Custom event","eventType":"FooBla","appName":"Delivery Service","tenantID":"shared","ipAddress":"127.0.0.1","timeStamp":"Mon 2019 Aug 05, 19:31:42:060","operatorID":"Admin@deliveryservice","nodeID":"ff9ef7835fd4906aea82694c981938d0","outcome":"Fail","message":"FooBla failed","requestorIdentity":"20190805T192510","lon":-98.0315889,"lat":30.123275}</samp>
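
For comparison, the same substitution can be sketched in Python. This is an illustration rather than part of the challenge solution: the function name unquote_numbers is hypothetical, and the list of keys (lon, lat, ts) is an assumption taken from the fields this exercise logs.

```python
import json
import re

# Group 1 captures the key, group 2 the quoted numeric value.
QUOTED_NUMBER = re.compile(r'"(lon|lat|ts)":"(-?[0-9]*\.[0-9]*)"')


def unquote_numbers(line):
    """Rewrite "lon":"-98.03" as "lon":-98.03 so that JSON parsers see numbers."""
    return QUOTED_NUMBER.sub(r'"\1":\2', line)


sample = '{"outcome":"Fail","lon":"-98.0315889","lat":"30.123275"}'
converted = unquote_numbers(sample)
print(converted)

# After the conversion the value parses as a number, not as a string.
print(type(json.loads(converted)["lat"]).__name__)
```

As in the sed version, \1 and \2 refer to the parenthesized groups in the order in which they open.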

Replace newlines within an entire file, not one line at a time

A line-oriented Unix tool such as sed processes each line of a file individually, so a pattern that spans a line break never matches. To replace the newlines, you must first read the entire file into a single large string, and then replace the newlines within that string.

The output of the newline-removing sed command is piped to sed again, this time to replace: }<space>{ with: }<comma>{

<samp>sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/ /g' PegaRULES-SecurityEvent.log | sed 's/} {/},{/g'</samp>
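
The goal of this step is a comma-separated list of objects that is valid JSON once it is wrapped in brackets. The following Python sketch shows that end result on two hypothetical records. It uses a different technique from the sed command (it splits the text into lines and joins them with commas instead of replacing newlines with spaces), and the function name log_to_json_array is an assumption.

```python
import json


def log_to_json_array(text):
    """Join one-object-per-line log text into a single JSON array string."""
    # Drop blank lines, then separate the remaining objects with commas.
    records = [line.strip() for line in text.splitlines() if line.strip()]
    return "[" + ",".join(records) + "]"


# Two hypothetical, already-unquoted log records, one per line.
log_text = '{"id":"a","lat":30.1}\n{"id":"b","lat":30.2}\n'
array_text = log_to_json_array(log_text)
print(array_text)

# The joined text parses as one array holding both records.
print(len(json.loads(array_text)))
```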

The complete query is assembled in test2.jq from two parts. The first part is the opening of the query, including the JSON array's left bracket: let $log := [

The second part is kept in a separate file, restof_test1.jq, which contains the end of the JSONiq query, starting at the JSON array's right bracket.

The following script builds test2.jq in three steps:

  1. Initialize a new query file by writing the opening line with > test2.jq
  2. Append the output of the chained sed commands with >> test2.jq
  3. Append the content of restof_test1.jq with >> test2.jq
JSONLogToArray.sh
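<samp>#!/usr/bin/env bash
set -x

# Start the query with the opening of the JSON array.
echo "let \$log := [" > test2.jq

# Unquote the numeric fields, join all lines into one, and separate the objects with commas.
cat PegaRULES-SecurityEvent.log \
| sed -r 's/"lon":"(-?[0-9]*\.[0-9]*)"/"lon":\1/' \
| sed -r 's/"lat":"(-?[0-9]*\.[0-9]*)"/"lat":\1/' \
| sed -r 's/"ts":"([0-9]*\.[0-9]*)"/"ts":\1/' \
| sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/ /g' \
| sed 's/} {/},{/g' >> test2.jq

# Append the rest of the query, starting at the array's closing bracket.
cat restof_test1.jq >> test2.jq</samp>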

The remainder of the JSONiq query (restof_test1.jq) follows. It computes the great-circle distance between two coordinates in miles using the haversine formula (3958.8 is the radius of the earth in miles), and then divides that distance by the time difference between the two log file entries. Because the timestamp is recorded as hours since 1-1-1970 GMT, the result is in miles per hour.

The filter condition ensures pairs of rows are only examined when:

  1. The eventType is Open Work (see the MDC-DS-Work OpenDefaults override above)
  2. The operatorID values within the two records are identical.
  3. The latitudes in both rows are greater than 20. This check only ensures that both rows contain geolocation coordinates; any comparison that fails when the coordinates are missing works.
  4. The timestamp in the second row is later than the timestamp in the first row. This avoids evaluating each pair twice and producing a negative velocity for the reversed pair.
restof_test1.jq
<samp>]
let $pi := 3.1415926
let $join :=
for $i in $log[], $j in $log[]
where $i.eventType = "Open Work"
and $i."id" != $j."id"
and $i."operatorID" = $j."operatorID"
and $i.lat>20 and $j.lat>20
and $j.ts>$i.ts
let $lat1 := $i.lat
let $lon1 := $i.lon
let $lat2 := $j.lat
let $lon2 := $j.lon
let $dlat := ($lat2 - $lat1) * $pi div 180
let $dlon := ($lon2 - $lon1) * $pi div 180
let $rlat1 := $lat1 * $pi div 180
let $rlat2 := $lat2 * $pi div 180
let $a := sin($dlat div 2) * sin($dlat div 2) + sin($dlon div 2) * sin($dlon div 2) * cos($rlat1) * cos($rlat2)
let $c := 2 * atan2(sqrt($a), sqrt(1-$a))
let $distance := $c * 3958.8
let $tdiff := $j.ts - $i.ts
let $mph := $distance div $tdiff
return { "id" : $j.id, "eventType" : $j.eventType, "distance":$distance, "mph":$mph}
return [$join]</samp>
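
To check the query's arithmetic outside of Rumble, the same haversine calculation and filter conditions can be sketched in Python. This is an illustration under stated assumptions, not part of the challenge solution: the function names and the two sample events are hypothetical, and every record is assumed to carry lat, lon, and ts (in hours).

```python
import math

EARTH_RADIUS_MILES = 3958.8  # the same radius that the query uses


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    rlat1, rlat2 = math.radians(lat1), math.radians(lat2)
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dlat / 2) ** 2
         + math.sin(dlon / 2) ** 2 * math.cos(rlat1) * math.cos(rlat2))
    return 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a)) * EARTH_RADIUS_MILES


def velocities(events):
    """Yield (later event id, miles, mph) for each qualifying pair of events."""
    for i in events:
        for j in events:
            if (i["eventType"] == "Open Work"
                    and i["id"] != j["id"]
                    and i["operatorID"] == j["operatorID"]
                    and i["lat"] > 20 and j["lat"] > 20
                    and j["ts"] > i["ts"]):
                miles = haversine_miles(i["lat"], i["lon"], j["lat"], j["lon"])
                yield j["id"], miles, miles / (j["ts"] - i["ts"])


# Hypothetical events: one degree of latitude and half an hour apart.
events = [
    {"id": "e1", "eventType": "Open Work", "operatorID": "op",
     "lat": 30.0, "lon": -98.0, "ts": 100.0},
    {"id": "e2", "eventType": "Open Work", "operatorID": "op",
     "lat": 31.0, "lon": -98.0, "ts": 100.5},
]
for event_id, miles, mph in velocities(events):
    print(event_id, round(miles, 1), round(mph, 1))
```

Like the query, this sketch compares every event with every other event, so the work grows with the square of the number of log entries.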

Send the JSONiq query to Apache Spark instead of running commands using shell mode

Opening the query file, selecting all of its content, and pasting it into Rumble's command line is tedious. Rumble can instead read the query from a file and write the result to a file:

  1. Instead of: --shell yes
  2. Use: --query-path file.jq --output-path results.out
  3. In a directory where the Rumble JAR file exists, on a system where Apache Spark is installed, run the following command:
pega8@Lubuntu1:~/rumble$ spark-submit --master local[*] --deploy-mode client spark-rumble-1.0.jar --query-path "test2.jq" --output-path "test2.out"
Example test2.out
[ { "id" : "f5b07887-11ef-4f6f-9f0f-efb060bd3cd7", "eventType" : "Open Work", "distance" : 31.6128421996284034764, "mph" : 2634.4035166357 }

The test2.out example can be interpreted as follows:

Two Security Events were recorded about 43 seconds apart (31.6 miles ÷ 2,634 mph ≈ 0.012 hours). To have traversed a distance of 31.6 miles (50.9 km) within the time between those two Security Events, someone would have had to travel at a speed of 2,634 miles per hour (4,240 km/h).
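
The figures in this interpretation can be re-derived from the two numbers in test2.out. The following Python sketch is only a worked check of that arithmetic; the variable names are hypothetical.

```python
KM_PER_MILE = 1.609344  # exact, by definition of the international mile

# Values copied from the example test2.out.
distance_miles = 31.6128421996284034764
speed_mph = 2634.4035166357

# Time between the two events: distance divided by speed, converted to seconds.
elapsed_seconds = distance_miles / speed_mph * 3600
distance_km = distance_miles * KM_PER_MILE
speed_kmh = speed_mph * KM_PER_MILE

print(round(elapsed_seconds, 1))  # seconds between the two events
print(round(distance_km, 1))      # distance in kilometers
print(round(speed_kmh))           # speed in kilometers per hour
```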

3 Review solution download
